Two main classes of jet clustering algorithms, cone and $k_t$, are briefly discussed. It is argued
that the former can often be cumbersome to define and implement, and difficult to analyze in
terms of its behaviour with respect to soft and collinear emissions. The latter, on the other
hand, enjoys a very simple definition, and can be easily shown to be infrared and collinear safe.
Its single potential shortcoming, a computational complexity believed to scale like the cube of
the number of particles ($N^3$), is overcome by introducing a new geometrical algorithm that
reduces it to $N \ln N$. A practical implementation of this approach to $k_t$-clustering, FastJet,
is shown to be orders of magnitude faster than all other present codes, opening the way to
the use of $k_t$-clustering even in highly populated heavy ion events.

High energy events are often studied in terms of jets. While a "jet" is in principle just a 
roughly collimated bunch of particles flying in the same direction, it takes of course a more 
careful definition to make it a tool for an accurate analysis of QCD. In particular, in order to 
be able to compare the experimentally observed jets to theoretical predictions, one must ensure 
that the measured quantity is "soft and collinear safe", meaning that the addition of a soft or
a collinear parton does not change its value. Only for this type of quantity can higher order 
calculations in QCD give sensible results. 

While jets have been discussed since the beginning of the '70s, the first modern definition of
a soft and collinear safe jet is due to Sterman and Weinberg. Their jets, whose definition was
originally formulated for $e^+e^-$ collisions, were of a kind which subsequently became known as
'cone-type'. They were later extended to hadronic collisions, where cone-type jets
are based on identifying energy flow into cones in (pseudo)rapidity $\eta = -\ln\tan(\theta/2)$ and
azimuth $\phi$, together with various steps of iteration, merging and splitting of the cones to
obtain the final jets. The freedom in the details of the clustering procedure has led to a number
of definitions of cone-type jet clustering algorithms, many of them currently used at the Tevatron
and in preliminary studies of LHC analyses. However, cone jet-finders tend to be rather complex:
different experiments have used different variants (some of them infrared unsafe), and it is often
difficult to know exactly which jet-finder to use in theoretical comparisons.

Partly in order to overcome these difficulties, at the beginning of the '90s cluster-type
jet-finders were proposed. They are generally based on successive pair-wise recombination of
particles, have simple definitions and are all infrared safe. The most widely used of them is the
$k_t$ jet-finder, defined below. Among its physics advantages are (a) that it purposely mimics
a walk backwards through the QCD branching sequence, which means that reconstructed jets
naturally collect most of the particles radiated from an original hard parton, giving better
particle mass measurements and gaps-between-jets identification (of relevance to Higgs
searches); and (b) that it allows one to decompose a jet into constituent subjets, which is useful
for identifying decay products of fast-moving heavy particles and for various QCD studies.
This has led to the widespread adoption of the $k_t$ jet-finder in the LEP ($e^+e^-$ collisions) and
HERA ($ep$) communities.

The $k_t$ jet-finder, in the longitudinally invariant formulation suitable for hadron colliders, is
defined as follows.

The $k_t$ jet-finder

1. For each pair of particles $i$, $j$ work out the $k_t$ distance
$d_{ij} = \min(k_{ti}^2, k_{tj}^2) R_{ij}^2$ with
$R_{ij}^2 = (\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2$, where $k_{ti}$, $\eta_i$ and $\phi_i$
are the transverse momentum, rapidity and azimuth of particle $i$; for
each parton $i$ also work out the beam distance $d_{iB} = k_{ti}^2$.

2. Find the minimum $d_{\min}$ of all the $d_{ij}$, $d_{iB}$. If $d_{\min}$ is a $d_{ij}$ merge particles
$i$ and $j$ into a single particle, summing their four-momenta (alternative
recombination schemes are possible); if it is a $d_{iB}$ then declare particle
$i$ to be a final jet and remove it from the list.

3. Repeat from step 1 until no particles are left.
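As an illustration only, the three steps above can be transcribed directly into a few lines of Python. This is a minimal sketch and is not taken from any existing jet-finder code: the function names (`from_pt_eta_phi`, `kt_cluster_naive`), the representation of particles as `(px, py, pz, E)` tuples and the wrapping of the azimuthal difference into $[0, \pi]$ are all assumptions made for the example. Each pass recomputes every $d_{ij}$, which costs of order $N^2$ operations and is repeated about $N$ times.

```python
import math

def from_pt_eta_phi(pt, eta, phi):
    """Hypothetical helper: massless four-momentum (px, py, pz, E)."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(eta), pt * math.cosh(eta))

def kt2(p):
    """Squared transverse momentum k_t^2."""
    return p[0] ** 2 + p[1] ** 2

def rapidity(p):
    """Rapidity; assumes the particle is not exactly along the beam."""
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r2(a, b):
    """R_ij^2, with the azimuthal difference wrapped into [0, pi] (assumption)."""
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2

def kt_cluster_naive(particles):
    """Direct transcription of steps 1-3; returns the list of final jets."""
    particles = [tuple(p) for p in particles]
    jets = []
    while particles:
        # Steps 1 and 2: scan all d_iB and d_ij for the minimum.
        best = (math.inf, None, None)
        for i, p_i in enumerate(particles):
            if kt2(p_i) < best[0]:
                best = (kt2(p_i), i, None)          # beam distance d_iB
            for j in range(i + 1, len(particles)):
                p_j = particles[j]
                d_ij = min(kt2(p_i), kt2(p_j)) * delta_r2(p_i, p_j)
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(particles.pop(i))           # d_iB is minimal: final jet
        else:
            p_j = particles.pop(j)                  # j > i, so pop j first
            p_i = particles.pop(i)
            particles.append(tuple(a + b for a, b in zip(p_i, p_j)))
        # Step 3: repeat until no particles are left.
    return jets
```

For instance, two nearby particles and a third one on the opposite side in azimuth would be expected to give two jets, the first two particles having been merged.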

One apparent drawback of this algorithm is its computational complexity, originally believed
to scale like $N^3$, $N$ being the number of particles to be clustered. This complexity leads
to concrete implementations which become slow as $N$ grows, making the use of $k_t$-clustering
impractical in environments where large numbers of particles are produced in the final state,
like hadron-hadron or, even more spectacularly, ion-ion collisions.

We show here that this computational complexity can in fact be reduced to $N \ln N$, opening
the way to a much more widespread use of the $k_t$ jet-finder.

To obtain a better algorithm we isolate the geometrical aspects of the problem, with the
help of the following observation (we refer to the literature for its proof): if $i$, $j$ form the
smallest $d_{ij}$, and $k_{ti} < k_{tj}$, then $R_{ij} < R_{i\ell}$ for all $\ell \neq j$, i.e. $j$ is
the geometrical nearest neighbour of particle $i$.
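The observation follows from a short argument by contradiction, sketched here for convenience. It assumes $k_{ti} > 0$ and yields the non-strict form of the inequality, so exact ties between distances are not addressed.

```latex
% Assume d_{ij} is the smallest of all the d_{ij} and that k_{ti} \le k_{tj},
% so that d_{ij} = k_{ti}^2 R_{ij}^2.
% Suppose some \ell \neq j had R_{i\ell} < R_{ij}. Then
\begin{align}
  d_{i\ell} = \min(k_{ti}^2, k_{t\ell}^2)\, R_{i\ell}^2
            \le k_{ti}^2\, R_{i\ell}^2
            < k_{ti}^2\, R_{ij}^2
            = d_{ij},
\end{align}
% which contradicts the minimality of d_{ij}.
% Hence R_{ij} \le R_{i\ell} for all \ell \neq j.
```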

This means that if we can identify each particle's geometrical nearest neighbour (in terms of
the geometrical $R_{ij}$ distance), then we need not construct a size-$N^2$ table of
$d_{ij} = \min(k_{ti}^2, k_{tj}^2) R_{ij}^2$, but only the size-$N$ array $d_{i\mathcal{G}_i}$, where
$\mathcal{G}_i$ is $i$'s geometrical nearest neighbour. We can therefore write the following
algorithm:

The FastJet Algorithm

1. For each particle $i$ establish its nearest neighbour $\mathcal{G}_i$ and construct the
arrays of the $d_{i\mathcal{G}_i}$ and $d_{iB}$.

2. Find the minimal value $d_{\min}$ of the $d_{i\mathcal{G}_i}$, $d_{iB}$.

3. Merge or remove the particles corresponding to $d_{\min}$ as appropriate.

4. Identify which particles' nearest neighbours have changed and update
the arrays of $d_{i\mathcal{G}_i}$ and $d_{iB}$. If any particles are left go to step 2.

Note: we shall drop 'geometrical' in the following, speaking simply of a 'nearest neighbour'.

Figure 1: Left: the Voronoi diagram (black lines) of ten points in a plane, numbered 1...10. Superimposed, in
red, is the Delaunay triangulation. Right: CPU time taken to cluster $N$ particles for various jet-finders. FastJet
is available at http://www.lpthe.jussieu.fr/~salam/fastjet



This already reduces the problem to one of complexity $N^2$: for each particle we can find its
nearest neighbour by scanning through all $O(N)$ other particles [$O(N^2)$ operations]; calculating
the $d_{i\mathcal{G}_i}$, $d_{iB}$ requires $O(N)$ operations; scanning through the $d_{i\mathcal{G}_i}$,
$d_{iB}$ to find the minimal value $d_{\min}$ takes $O(N)$ operations [to be repeated $N$ times]; and
after a merging or removal, updating the nearest neighbour information will require $O(N)$
operations [to be repeated $N$ times].
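The brute-force $N^2$ variant just described can be sketched as follows. This is an illustrative Python transcription under stated assumptions, not the FastJet implementation: the names `from_pt_eta_phi`, `nearest` and `kt_cluster_nn`, the `(px, py, pz, E)` tuples and the azimuthal wrapping are choices made for the example. Nearest neighbours are found by plain $O(N)$ scans, and they are recomputed only for particles whose neighbour has just disappeared.

```python
import math

def from_pt_eta_phi(pt, eta, phi):
    """Hypothetical helper: massless four-momentum (px, py, pz, E)."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(eta), pt * math.cosh(eta))

def kt2(p):
    return p[0] ** 2 + p[1] ** 2

def dist2(a, b):
    """Geometrical distance R_ij^2 (azimuth wrapped into [0, pi], an assumption)."""
    ya = 0.5 * math.log((a[3] + a[2]) / (a[3] - a[2]))
    yb = 0.5 * math.log((b[3] + b[2]) / (b[3] - b[2]))
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return (ya - yb) ** 2 + dphi ** 2

def nearest(i, mom):
    """O(N) scan: returns (R^2, id) of the nearest neighbour of particle i."""
    best = (math.inf, None)
    for j, p_j in mom.items():
        if j != i:
            r2 = dist2(mom[i], p_j)
            if r2 < best[0]:
                best = (r2, j)
    return best

def kt_cluster_nn(particles):
    mom = {i: tuple(p) for i, p in enumerate(particles)}
    nn = {i: nearest(i, mom) for i in mom}           # step 1, O(N^2) in total
    next_id, jets = len(mom), []
    while mom:
        # Step 2: O(N) scan over the d_iB and the d_iG_i.
        dmin, imin, is_pair = math.inf, None, False
        for i, p in mom.items():
            if kt2(p) < dmin:
                dmin, imin, is_pair = kt2(p), i, False
            r2, g = nn[i]
            if g is not None:
                d = min(kt2(p), kt2(mom[g])) * r2
                if d < dmin:
                    dmin, imin, is_pair = d, i, True
        # Step 3: merge the pair, or declare a final jet.
        if is_pair:
            removed = {imin, nn[imin][1]}
            new = tuple(sum(c) for c in zip(*(mom[r] for r in removed)))
        else:
            removed, new = {imin}, None
            jets.append(mom[imin])
        for r in removed:
            del mom[r], nn[r]
        # Step 4: update the nearest-neighbour information.
        if new is not None:
            k, next_id = next_id, next_id + 1
            mom[k] = new
            nn[k] = nearest(k, mom)
        for l in mom:
            if nn[l][1] in removed:
                nn[l] = nearest(l, mom)              # neighbour vanished: rescan
            elif new is not None and l != k:
                r2 = dist2(mom[l], new)
                if r2 < nn[l][0]:
                    nn[l] = (r2, k)                  # merged particle is closer
    return jets
```

The remaining $O(N)$ costs per iteration in this sketch are the scan for the minimum and the nearest-neighbour rescans.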

We note, though, that three steps of this algorithm (initial nearest neighbour identification,
finding $d_{\min}$ at each iteration, and updating the nearest neighbour information at each
iteration) bear close resemblance to problems studied in the computer science literature and for
which efficient solutions are known. An example is the use of a structure known as a Voronoi
diagram or its dual, a Delaunay triangulation (see fig. 1), to find the nearest neighbour of each
element of an ensemble of vertices in a plane (specified by the $\eta_i$ and $\phi_i$ of the
particles). It can be shown that such a structure can be built with $O(N \ln N)$ operations and
updated with $O(\ln N)$ operations (to be repeated $N$ times). More details, concerning also other
steps in the algorithm, are given in the references. The final result is that both the geometrical
and minimum-finding aspects of the $k_t$ jet-finder can be related to known problems whose
solutions require $O(N \ln N)$ operations.
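To make the minimum-finding aspect concrete, one standard data structure with the required behaviour is a binary heap with lazy deletion. The sketch below is only an example of such a structure and is not claimed to be the one used in FastJet; the class name `LazyMinHeap` and its methods are hypothetical. Each particle's key would be its identifier and its value the smaller of $d_{iB}$ and $d_{i\mathcal{G}_i}$.

```python
import heapq

class LazyMinHeap:
    """Keyed min-heap; stale entries are discarded lazily when they surface.

    Each update or removal costs O(ln N) amortised. Keys must be mutually
    comparable, since they break ties between equal values.
    """

    def __init__(self):
        self._heap = []       # entries (value, key, stamp)
        self._stamp = {}      # key -> stamp of its current (valid) entry
        self._counter = 0

    def update(self, key, value):
        """Insert a key, or replace the value stored for an existing key."""
        self._counter += 1
        self._stamp[key] = self._counter
        heapq.heappush(self._heap, (value, key, self._counter))

    def remove(self, key):
        """Forget a key; its heap entry is dropped when it reaches the top."""
        self._stamp.pop(key, None)

    def min(self):
        """Return (value, key) of the current minimum, or None if empty."""
        while self._heap:
            value, key, stamp = self._heap[0]
            if self._stamp.get(key) == stamp:
                return value, key
            heapq.heappop(self._heap)   # stale entry: discard and continue
        return None
```

In a clustering loop one would call `update` for each particle whose distance has changed, `remove` for merged or finished particles, and `min` once per iteration.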

The FastJet algorithm has been implemented in the C++ code FastJet. The building and
the updating of the Voronoi diagram have been performed using the publicly available
Computational Geometry Algorithms Library (CGAL), in particular its triangulation components.
The resulting running time for the clustering of $N$ particles is displayed in fig. 1. It can be seen
to be faster than all other codes currently used, of both cone and $k_t$ type. Analyses of events
with extremely high multiplicity, like heavy ion collisions at the LHC, are now feasible, their
clustering taking only about 1 second, rather than 1 day of CPU time.

The speed of FastJet does more, however, than just make analyses with a few hundred
particles faster, or those with a few thousand possible. In fact, it allows one to do new things.
One example is the possibility of calculating the area of each jet by adding to the event a
large number of extremely soft 'ghost' particles, and counting how many get clustered into
any given jet. This approach is of course computationally heavy, and would be unfeasible,
or at least extremely impractical, with a slower jet-finder. Fig. 2 shows the result of this
procedure on an LHC event made of one hard and many soft jets. Estimating jet areas is of
course not interesting by itself, but as an intermediate step towards performing an
event-by-event subtraction of underlying event/minimum bias energy from the hard jets. This work is
presently in progress.

Figure 2: A simulated "typical" event at high luminosity at the LHC. Left: A single event with two hard jets
has been combined with about 10 softer events. Right: Very soft 'ghost' particles have been added in order to be
able to quantify more precisely the area of each jet.
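The ghost-particle area estimate described in the text can be illustrated with the following Python sketch. It is an assumption-laden toy and not the procedure used for Fig. 2: the function names (`four_vec`, `cluster`, `jet_areas`), the ghost transverse momentum, the rapidity range and the uniform random placement of ghosts are all choices made for the example, and the clustering is the naive $N^3$ one, which is exactly why the approach is slow without a fast jet-finder.

```python
import math
import random

def four_vec(pt, y, phi):
    """Hypothetical helper: massless four-momentum (px, py, pz, E)."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(y), pt * math.cosh(y))

def coords(p):
    """Return (k_t^2, rapidity, azimuth) of a four-momentum."""
    px, py, pz, e = p
    return px * px + py * py, 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px)

def cluster(momenta):
    """Naive kt clustering; returns a list of (four-momentum, constituent indices)."""
    objs = [(tuple(p), [i]) for i, p in enumerate(momenta)]
    jets = []
    while objs:
        geo = [coords(p) for p, _ in objs]
        best = (math.inf, None, None)
        for i, (k2i, yi, fi) in enumerate(geo):
            if k2i < best[0]:
                best = (k2i, i, None)                  # beam distance
            for j in range(i + 1, len(geo)):
                k2j, yj, fj = geo[j]
                dphi = abs(fi - fj)
                dphi = min(dphi, 2 * math.pi - dphi)   # wrap azimuth (assumption)
                d = min(k2i, k2j) * ((yi - yj) ** 2 + dphi ** 2)
                if d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))
        else:
            p_j, c_j = objs.pop(j)                     # j > i, so pop j first
            p_i, c_i = objs.pop(i)
            objs.append((tuple(a + b for a, b in zip(p_i, p_j)), c_i + c_j))
    return jets

def jet_areas(hard, n_ghosts=150, y_max=2.0, ghost_pt=1e-50, seed=1):
    """Estimate jet areas by counting ghosts clustered into each non-ghost jet.

    Ghosts are spread uniformly in |y| < y_max and in azimuth, so each ghost
    represents an area (2 * y_max) * (2 * pi) / n_ghosts.
    """
    rng = random.Random(seed)
    ghosts = [four_vec(ghost_pt, rng.uniform(-y_max, y_max),
                       rng.uniform(-math.pi, math.pi)) for _ in range(n_ghosts)]
    area_per_ghost = (2 * y_max) * (2 * math.pi) / n_ghosts
    n_hard = len(hard)
    result = []
    for p, members in cluster(list(hard) + ghosts):
        if any(m < n_hard for m in members):           # skip pure-ghost jets
            n_in_jet = sum(1 for m in members if m >= n_hard)
            result.append((p, n_in_jet * area_per_ghost))
    return result
```

Because the ghosts carry negligible momentum, the jets' four-momenta are essentially unchanged by their presence; the statistical precision of the area improves as `n_ghosts` grows, at a steeply increasing cost for this naive clustering.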